Overview

Dataset statistics

Number of variables: 21
Number of observations: 10800
Missing cells: 15325
Missing cells (%): 6.8%
Duplicate rows: 206
Duplicate rows (%): 1.9%
Total size in memory: 11.4 MiB
Average record size in memory: 1.1 KiB
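
The percentages in this block can be re-derived from the raw counts. The sketch below is a plain arithmetic check, assuming that "Missing cells (%)" is taken over all cells (variables × observations) and that the average record size is the total memory divided by the number of observations; the variable names are illustrative.

```python
# Figures copied from the "Dataset statistics" block.
n_variables = 21
n_observations = 10800
missing_cells = 15325
duplicate_rows = 206
total_size_mib = 11.4

# Assumption: the missing-cell percentage is computed over every cell.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells
duplicate_pct = 100 * duplicate_rows / n_observations

# Assumption: average record size = total memory / number of observations.
avg_record_kib = total_size_mib * 1024 / n_observations

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Duplicate rows (%): {duplicate_pct:.1f}%")
print(f"Average record size in memory: {avg_record_kib:.1f} KiB")
```

Under these assumptions the three printed values round to the figures reported in the block.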

Variable types

Text: 7
DateTime: 2
Categorical: 7
Numeric: 5

Alerts

Country has constant value "United States" [Constant]
Dataset has 206 (1.9%) duplicate rows [Duplicates]
Category is highly overall correlated with Sub-Category [High correlation]
Discount is highly overall correlated with Profit [High correlation]
Postal Code is highly overall correlated with Region and 1 other fields [High correlation]
Profit is highly overall correlated with Discount and 1 other fields [High correlation]
Region is highly overall correlated with Postal Code and 1 other fields [High correlation]
Sales is highly overall correlated with Profit [High correlation]
State is highly overall correlated with Postal Code and 1 other fields [High correlation]
Sub-Category is highly overall correlated with Category [High correlation]
Order Date has 806 (7.5%) missing values [Missing]
Ship Date has 806 (7.5%) missing values [Missing]
Ship Mode has 806 (7.5%) missing values [Missing]
Customer ID has 806 (7.5%) missing values [Missing]
Customer Name has 806 (7.5%) missing values [Missing]
Segment has 806 (7.5%) missing values [Missing]
Country has 806 (7.5%) missing values [Missing]
City has 806 (7.5%) missing values [Missing]
State has 806 (7.5%) missing values [Missing]
Postal Code has 817 (7.6%) missing values [Missing]
Region has 806 (7.5%) missing values [Missing]
Product ID has 806 (7.5%) missing values [Missing]
Category has 806 (7.5%) missing values [Missing]
Sub-Category has 806 (7.5%) missing values [Missing]
Product Name has 806 (7.5%) missing values [Missing]
Sales has 806 (7.5%) missing values [Missing]
Quantity has 806 (7.5%) missing values [Missing]
Discount has 806 (7.5%) missing values [Missing]
Profit has 806 (7.5%) missing values [Missing]
Discount has 4798 (44.4%) zeros [Zeros]
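
The per-column missing counts in these alerts add up to the dataset-level "Missing cells" figure. The sketch below checks that with plain arithmetic, using only numbers from the alerts; the dictionary layout is illustrative, and the percentages assume each count is divided by the 10800 observations.

```python
n_observations = 10800

# Columns flagged with 806 missing values in the alerts.
columns_with_806 = [
    "Order Date", "Ship Date", "Ship Mode", "Customer ID", "Customer Name",
    "Segment", "Country", "City", "State", "Region", "Product ID",
    "Category", "Sub-Category", "Product Name", "Sales", "Quantity",
    "Discount", "Profit",
]
missing = {name: 806 for name in columns_with_806}
missing["Postal Code"] = 817  # the only column with a different count

total_missing = sum(missing.values())
print("Total missing cells:", total_missing)

# Assumption: each percentage is the count divided by the number of observations.
for name in ("Order Date", "Postal Code"):
    print(f"{name}: {100 * missing[name] / n_observations:.1f}% missing")

discount_zero_pct = 100 * 4798 / n_observations
print(f"Discount zeros: {discount_zero_pct:.1f}%")
```

The total matches the 15325 missing cells reported under "Dataset statistics". That 18 columns share exactly 806 missing values suggests the same rows are empty across them, although the alerts alone do not confirm this.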

Reproduction

Analysis started: 2025-02-19 00:53:46.951001
Analysis finished: 2025-02-19 00:54:07.699370
Duration: 20.75 seconds
Software version: ydata-profiling v4.12.2
Download configuration: config.json
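
The reported duration follows from the two timestamps. A minimal check using only the Python standard library (variable names are illustrative):

```python
from datetime import datetime

# Timestamps copied from the "Reproduction" block.
started = datetime.fromisoformat("2025-02-19 00:53:46.951001")
finished = datetime.fromisoformat("2025-02-19 00:54:07.699370")

# Elapsed wall-clock time between the two timestamps, in seconds.
duration = (finished - started).total_seconds()
print(f"Duration: {duration:.2f} seconds")
```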

Variables

Row ID
Text

Distinct: 10001
Distinct (%): 92.6%
Missing: 0
Missing (%): 0.0%
Memory size: 641.7 KiB

Length

Max length: 17
Median length: 4
Mean length: 3.8275926
Min length: 1

Characters and Unicode

Total characters: 41338
Distinct characters: 38
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 10000
Unique (%): 92.6%

Sample

1st row: 1
2nd row: 2
3rd row: 3
4th row: 4
5th row: 5

Value | Count | Frequency (%)
yes | 800 | 7.4%
8 | 1 | < 0.1%
19 | 1 | < 0.1%
18 | 1 | < 0.1%
3 | 1 | < 0.1%
4 | 1 | < 0.1%
5 | 1 | < 0.1%
6 | 1 | < 0.1%
7 | 1 | < 0.1%
9 | 1 | < 0.1%
Other values (9995) | 9995 | 92.5%

Most occurring characters

Value | Count | Frequency (%)
1 | 4000 | 9.7%
4 | 4000 | 9.7%
2 | 4000 | 9.7%
3 | 4000 | 9.7%
6 | 3999 | 9.7%
7 | 3999 | 9.7%
5 | 3999 | 9.7%
8 | 3999 | 9.7%
9 | 3984 | 9.6%
0 | 2889 | 7.0%
Other values (28) | 2469 | 6.0%

Most occurring categories, scripts and blocks

Value | Count | Frequency (%)
(unknown) | 41338 | 100.0%

The per-category, per-script and per-block character frequencies are identical to the most occurring characters.

Distinct: 5015
Distinct (%): 46.4%
Missing: 0
Missing (%): 0.0%
Memory size: 748.9 KiB

Length

Max length: 14
Median length: 14
Mean length: 13.99537
Min length: 4

Characters and Unicode

Total characters: 151150
Distinct characters: 35
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 2454
Unique (%): 22.7%

Sample

1st row: CA-2017-152156
2nd row: CA-2017-152156
3rd row: CA-2017-138688
4th row: US-2016-108966
5th row: US-2016-108966

Value | Count | Frequency (%)
ca-2018-100111 | 28 | 0.3%
ca-2017-165330 | 22 | 0.2%
ca-2016-164882 | 18 | 0.2%
ca-2015-142769 | 16 | 0.1%
us-2018-118087 | 16 | 0.1%
ca-2018-161956 | 16 | 0.1%
ca-2018-119284 | 14 | 0.1%
ca-2017-145261 | 14 | 0.1%
ca-2018-166093 | 14 | 0.1%
ca-2015-160766 | 14 | 0.1%
Other values (5006) | 10629 | 98.4%

Most occurring characters

Value | Count | Frequency (%)
1 | 27531 | 18.2%
- | 21588 | 14.3%
0 | 16746 | 11.1%
2 | 16601 | 11.0%
C | 8978 | 5.9%
A | 8977 | 5.9%
6 | 8067 | 5.3%
8 | 8015 | 5.3%
5 | 7780 | 5.1%
7 | 7236 | 4.8%
Other values (25) | 19631 | 13.0%

Most occurring categories, scripts and blocks

Value | Count | Frequency (%)
(unknown) | 151150 | 100.0%

The per-category, per-script and per-block character frequencies are identical to the most occurring characters.

Order Date
Date

Missing 

Distinct: 1236
Distinct (%): 12.4%
Missing: 806
Missing (%): 7.5%
Memory size: 84.5 KiB
Minimum: 2015-01-03 00:00:00
Maximum: 2018-12-30 00:00:00
Invalid dates: 0
Invalid dates (%): 0.0%

[Figure: histogram with fixed size bins (bins=50)]

Ship Date
Date

Missing 

Distinct: 1334
Distinct (%): 13.3%
Missing: 806
Missing (%): 7.5%
Memory size: 84.5 KiB
Minimum: 2015-01-07 00:00:00
Maximum: 2019-01-05 00:00:00
Invalid dates: 0
Invalid dates (%): 0.0%

[Figure: histogram with fixed size bins (bins=50)]

Ship Mode
Categorical

Missing 

Distinct: 4
Distinct (%): < 0.1%
Missing: 806
Missing (%): 7.5%
Memory size: 713.1 KiB

Value | Count
Standard Class | 5968
Second Class | 1945
First Class | 1538
Same Day | 543

Length

Max length: 14
Median length: 14
Mean length: 12.823094
Min length: 8

Characters and Unicode

Total characters: 128154
Distinct characters: 18
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Second Class
2nd row: Second Class
3rd row: Second Class
4th row: Standard Class
5th row: Standard Class

Common Values

Value | Count | Frequency (%)
Standard Class | 5968 | 55.3%
Second Class | 1945 | 18.0%
First Class | 1538 | 14.2%
Same Day | 543 | 5.0%
(Missing) | 806 | 7.5%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values]

Value | Count | Frequency (%)
class | 9451 | 47.3%
standard | 5968 | 29.9%
second | 1945 | 9.7%
first | 1538 | 7.7%
same | 543 | 2.7%
day | 543 | 2.7%
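
The word counts for Ship Mode can be reproduced from the category counts. The sketch below assumes the words are lower-cased, whitespace-separated tokens weighted by how many rows carry each value; this is inferred from the numbers, not taken from the profiler's source.

```python
from collections import Counter

# Category counts from the Ship Mode "Common Values" table (missing rows excluded).
ship_mode_counts = {
    "Standard Class": 5968,
    "Second Class": 1945,
    "First Class": 1538,
    "Same Day": 543,
}

# Assumption: each value is lower-cased and split on whitespace,
# and every token is counted once per row that contains it.
words = Counter()
for value, count in ship_mode_counts.items():
    for token in value.lower().split():
        words[token] += count

total_tokens = sum(words.values())
for token, count in words.most_common():
    print(f"{token} | {count} | {100 * count / total_tokens:.1f}%")
```

Under that assumption, "class" is counted once for every Standard, Second and First Class row, which is why it totals 9451.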

Most occurring characters

Value | Count | Frequency (%)
a | 22473 | 17.5%
s | 20440 | 15.9%
d | 13881 | 10.8%
(space) | 9994 | 7.8%
l | 9451 | 7.4%
C | 9451 | 7.4%
S | 8456 | 6.6%
n | 7913 | 6.2%
r | 7506 | 5.9%
t | 7506 | 5.9%
Other values (8) | 11083 | 8.6%

Most occurring categories, scripts and blocks

Value | Count | Frequency (%)
(unknown) | 128154 | 100.0%

The per-category, per-script and per-block character frequencies are identical to the most occurring characters.

Customer ID
Text

Missing 

Distinct: 793
Distinct (%): 7.9%
Missing: 806
Missing (%): 7.5%
Memory size: 659.7 KiB

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Characters and Unicode

Total characters: 79952
Distinct characters: 40
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 5
Unique (%): 0.1%

Sample

1st row: CG-12520
2nd row: CG-12520
3rd row: DV-13045
4th row: SO-20335
5th row: SO-20335

Value | Count | Frequency (%)
wb-21850 | 37 | 0.4%
jl-15835 | 34 | 0.3%
ma-17560 | 34 | 0.3%
pp-18955 | 34 | 0.3%
jd-15895 | 32 | 0.3%
eh-13765 | 32 | 0.3%
ck-12205 | 32 | 0.3%
sv-20365 | 32 | 0.3%
ep-13915 | 31 | 0.3%
zc-21910 | 31 | 0.3%
Other values (783) | 9665 | 96.7%

Most occurring characters

Value | Count | Frequency (%)
1 | 11915 | 14.9%
- | 9994 | 12.5%
0 | 8532 | 10.7%
5 | 7865 | 9.8%
2 | 4682 | 5.9%
7 | 2931 | 3.7%
6 | 2909 | 3.6%
9 | 2904 | 3.6%
8 | 2818 | 3.5%
3 | 2779 | 3.5%
Other values (30) | 22623 | 28.3%

Most occurring categories, scripts and blocks

Value | Count | Frequency (%)
(unknown) | 79952 | 100.0%

The per-category, per-script and per-block character frequencies are identical to the most occurring characters.

Customer Name
Text

Missing 

Distinct: 793
Distinct (%): 7.9%
Missing: 806
Missing (%): 7.5%
Memory size: 711.6 KiB

Length

Max length: 22
Median length: 18
Mean length: 12.960676
Min length: 7

Characters and Unicode

Total characters: 129529
Distinct characters: 57
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 5
Unique (%): 0.1%

Sample

1st row: Claire Gute
2nd row: Claire Gute
3rd row: Darrin Van Huff
4th row: Sean O'Donnell
5th row: Sean O'Donnell

Value | Count | Frequency (%)
michael | 120 | 0.6%
frank | 112 | 0.6%
john | 107 | 0.5%
patrick | 96 | 0.5%
stewart | 93 | 0.5%
brian | 93 | 0.5%
paul | 92 | 0.5%
rick | 91 | 0.5%
ken | 91 | 0.5%
matt | 86 | 0.4%
Other values (901) | 19072 | 95.1%

Most occurring characters

Value | Count | Frequency (%)
a | 12011 | 9.3%
e | 11836 | 9.1%
n | 10241 | 7.9%
(space) | 10059 | 7.8%
r | 9530 | 7.4%
i | 7919 | 6.1%
l | 6494 | 5.0%
o | 5850 | 4.5%
t | 5435 | 4.2%
s | 4546 | 3.5%
Other values (47) | 45608 | 35.2%

Most occurring categories, scripts and blocks

Value | Count | Frequency (%)
(unknown) | 129529 | 100.0%

The per-category, per-script and per-block character frequencies are identical to the most occurring characters.

Segment
Categorical

Missing 

Distinct: 3
Distinct (%): < 0.1%
Missing: 806
Missing (%): 7.5%
Memory size: 674.2 KiB

Value | Count
Consumer | 5191
Corporate | 3020
Home Office | 1783

Length

Max length: 11
Median length: 8
Mean length: 8.8374024
Min length: 8

Characters and Unicode

Total characters: 88321
Distinct characters: 17
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Consumer
2nd row: Consumer
3rd row: Corporate
4th row: Consumer
5th row: Consumer

Common Values

Value | Count | Frequency (%)
Consumer | 5191 | 48.1%
Corporate | 3020 | 28.0%
Home Office | 1783 | 16.5%
(Missing) | 806 | 7.5%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values]

Value | Count | Frequency (%)
consumer | 5191 | 44.1%
corporate | 3020 | 25.6%
home | 1783 | 15.1%
office | 1783 | 15.1%

Most occurring characters

Value | Count | Frequency (%)
o | 13014 | 14.7%
e | 11777 | 13.3%
r | 11231 | 12.7%
C | 8211 | 9.3%
m | 6974 | 7.9%
n | 5191 | 5.9%
s | 5191 | 5.9%
u | 5191 | 5.9%
f | 3566 | 4.0%
t | 3020 | 3.4%
Other values (7) | 14955 | 16.9%

Most occurring categories, scripts and blocks

Value | Count | Frequency (%)
(unknown) | 88321 | 100.0%

The per-category, per-script and per-block character frequencies are identical to the most occurring characters.

Country
Categorical

Constant  Missing 

Distinct: 1
Distinct (%): < 0.1%
Missing: 806
Missing (%): 7.5%
Memory size: 714.8 KiB

Value | Count
United States | 9994

Length

Max length: 13
Median length: 13
Mean length: 13
Min length: 13

Characters and Unicode

Total characters: 129922
Distinct characters: 10
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: United States
2nd row: United States
3rd row: United States
4th row: United States
5th row: United States

Common Values

Value | Count | Frequency (%)
United States | 9994 | 92.5%
(Missing) | 806 | 7.5%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values]

Value | Count | Frequency (%)
united | 9994 | 50.0%
states | 9994 | 50.0%

Most occurring characters

Value | Count | Frequency (%)
t | 29982 | 23.1%
e | 19988 | 15.4%
U | 9994 | 7.7%
n | 9994 | 7.7%
i | 9994 | 7.7%
d | 9994 | 7.7%
(space) | 9994 | 7.7%
S | 9994 | 7.7%
a | 9994 | 7.7%
s | 9994 | 7.7%

Most occurring categories, scripts and blocks

Value | Count | Frequency (%)
(unknown) | 129922 | 100.0%

The per-category, per-script and per-block character frequencies are identical to the most occurring characters.

City
Text

Missing 

Distinct: 531
Distinct (%): 5.3%
Missing: 806
Missing (%): 7.5%
Memory size: 672.7 KiB

Length

Max length: 17
Median length: 14
Mean length: 9.3306984
Min length: 4

Characters and Unicode

Total characters: 93251
Distinct characters: 51
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 70
Unique (%): 0.7%

Sample

1st row: Henderson
2nd row: Henderson
3rd row: Los Angeles
4th row: Fort Lauderdale
5th row: Fort Lauderdale

Value | Count | Frequency (%)
city | 994 | 7.0%
new | 937 | 6.6%
york | 920 | 6.5%
san | 805 | 5.7%
los | 747 | 5.2%
angeles | 747 | 5.2%
philadelphia | 537 | 3.8%
francisco | 510 | 3.6%
seattle | 428 | 3.0%
houston | 377 | 2.6%
Other values (555) | 7234 | 50.8%

Most occurring characters

Value | Count | Frequency (%)
e | 8719 | 9.4%
a | 7591 | 8.1%
o | 7499 | 8.0%
i | 6229 | 6.7%
n | 6199 | 6.6%
l | 5986 | 6.4%
s | 4699 | 5.0%
r | 4468 | 4.8%
t | 4438 | 4.8%
(space) | 4242 | 4.5%
Other values (41) | 33181 | 35.6%

Most occurring categories, scripts and blocks

Value | Count | Frequency (%)
(unknown) | 93251 | 100.0%

The per-category, per-script and per-block character frequencies are identical to the most occurring characters.

State
Categorical

High correlation  Missing 

Distinct: 49
Distinct (%): 0.5%
Missing: 806
Missing (%): 7.5%
Memory size: 670.7 KiB

Value | Count
California | 2001
New York | 1128
Texas | 985
Pennsylvania | 587
Washington | 506
Other values (44) | 4787

Length

Max length: 20
Median length: 14
Mean length: 8.4871923
Min length: 4

Characters and Unicode

Total characters: 84821
Distinct characters: 46
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: Kentucky
2nd row: Kentucky
3rd row: California
4th row: Florida
5th row: Florida

Common Values

Value | Count | Frequency (%)
California | 2001 | 18.5%
New York | 1128 | 10.4%
Texas | 985 | 9.1%
Pennsylvania | 587 | 5.4%
Washington | 506 | 4.7%
Illinois | 492 | 4.6%
Ohio | 469 | 4.3%
Florida | 383 | 3.5%
Michigan | 255 | 2.4%
North Carolina | 249 | 2.3%
Other values (39) | 2939 | 27.2%
(Missing) | 806 | 7.5%

Length

[Figure: histogram of lengths of the category]

Value | Count | Frequency (%)
california | 2001 | 17.1%
new | 1322 | 11.3%
york | 1128 | 9.6%
texas | 985 | 8.4%
pennsylvania | 587 | 5.0%
washington | 506 | 4.3%
illinois | 492 | 4.2%
ohio | 469 | 4.0%
florida | 383 | 3.3%
carolina | 291 | 2.5%
Other values (43) | 3542 | 30.3%

Most occurring characters

Value | Count | Frequency (%)
a | 10758 | 12.7%
i | 9895 | 11.7%
n | 8090 | 9.5%
o | 7323 | 8.6%
r | 5544 | 6.5%
e | 5051 | 6.0%
l | 4822 | 5.7%
s | 4604 | 5.4%
C | 2566 | 3.0%
f | 2011 | 2.4%
Other values (36) | 24157 | 28.5%

Most occurring categories, scripts and blocks

Value | Count | Frequency (%)
(unknown) | 84821 | 100.0%

The per-category, per-script and per-block character frequencies are identical to the most occurring characters.

Postal Code
Real number (ℝ)

High correlation  Missing 

Distinct: 630
Distinct (%): 6.3%
Missing: 817
Missing (%): 7.6%
Infinite: 0
Infinite (%): 0.0%
Mean: 55245.233
Minimum: 1040
Maximum: 99301
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 84.5 KiB

Quantile statistics

Minimum: 1040
5-th percentile: 10009
Q1: 23223
Median: 57103
Q3: 90008
95-th percentile: 98006
Maximum: 99301
Range: 98261
Interquartile range (IQR): 66785

Descriptive statistics

Standard deviation: 32038.716
Coefficient of variation (CV): 0.5799363
Kurtosis: -1.4922003
Mean: 55245.233
Median Absolute Deviation (MAD): 32929
Skewness: -0.12997082
Sum: 5.5151316 × 10^8
Variance: 1.0264793 × 10^9
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=50)]
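
Several of the Postal Code statistics are derived from others in the same block, which gives a quick consistency check. The sketch below uses only reported figures and assumes the sum is taken over the 10800 − 817 non-missing rows; small differences are expected because the mean and standard deviation are themselves rounded.

```python
# Figures copied from the Postal Code statistics.
minimum, maximum = 1040, 99301
q1, q3 = 23223, 90008
mean, std = 55245.233, 32038.716
n_present = 10800 - 817  # assumption: non-missing rows only

value_range = maximum - minimum   # reported Range
iqr = q3 - q1                     # reported Interquartile range (IQR)
cv = std / mean                   # reported Coefficient of variation (CV)
variance = std ** 2               # reported Variance
total = mean * n_present          # reported Sum

print("Range:", value_range)
print("IQR:", iqr)
print(f"CV: {cv:.4f}")
print(f"Variance: {variance:.4e}")
print(f"Sum: {total:.4e}")
```

The derived values agree with the report to the precision shown, which also confirms that the exponents in "Sum" and "Variance" are 8 and 9.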

Common Values

Value | Count | Frequency (%)
10035 | 263 | 2.4%
10024 | 230 | 2.1%
10009 | 229 | 2.1%
94122 | 203 | 1.9%
10011 | 193 | 1.8%
94110 | 166 | 1.5%
98105 | 165 | 1.5%
19134 | 160 | 1.5%
90049 | 151 | 1.4%
98103 | 151 | 1.4%
Other values (620) | 8072 | 74.7%
(Missing) | 817 | 7.6%

Minimum 10 values

Value | Count | Frequency (%)
1040 | 1 | < 0.1%
1453 | 6 | 0.1%
1752 | 2 | < 0.1%
1810 | 4 | < 0.1%
1841 | 33 | 0.3%
1852 | 16 | 0.1%
1915 | 3 | < 0.1%
2038 | 17 | 0.2%
2138 | 6 | 0.1%
2148 | 3 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
99301 | 6 | 0.1%
99207 | 7 | 0.1%
98661 | 5 | < 0.1%
98632 | 3 | < 0.1%
98502 | 5 | < 0.1%
98270 | 2 | < 0.1%
98226 | 3 | < 0.1%
98208 | 1 | < 0.1%
98198 | 7 | 0.1%
98115 | 112 | 1.0%

Region
Categorical

High correlation  Missing 

Distinct: 4
Distinct (%): < 0.1%
Missing: 806
Missing (%): 7.5%
Memory size: 635.3 KiB

Value | Count
West | 3203
East | 2848
Central | 2323
South | 1620

Length

Max length: 7
Median length: 4
Mean length: 4.8594156
Min length: 4

Characters and Unicode

Total characters: 48565
Distinct characters: 14
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: South
2nd row: South
3rd row: West
4th row: South
5th row: South

Common Values

Value | Count | Frequency (%)
West | 3203 | 29.7%
East | 2848 | 26.4%
Central | 2323 | 21.5%
South | 1620 | 15.0%
(Missing) | 806 | 7.5%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values]

Value | Count | Frequency (%)
west | 3203 | 32.0%
east | 2848 | 28.5%
central | 2323 | 23.2%
south | 1620 | 16.2%

Most occurring characters

Value | Count | Frequency (%)
t | 9994 | 20.6%
s | 6051 | 12.5%
e | 5526 | 11.4%
a | 5171 | 10.6%
W | 3203 | 6.6%
E | 2848 | 5.9%
C | 2323 | 4.8%
n | 2323 | 4.8%
r | 2323 | 4.8%
l | 2323 | 4.8%
Other values (4) | 6480 | 13.3%

Most occurring categories, scripts and blocks

Value | Count | Frequency (%)
(unknown) | 48565 | 100.0%

The per-category, per-script and per-block character frequencies are identical to the most occurring characters.

Product ID
Text

Missing 

Distinct: 1862
Distinct (%): 18.6%
Missing: 806
Missing (%): 7.5%
Memory size: 728.0 KiB

Length

Max length: 15
Median length: 15
Mean length: 15
Min length: 15

Characters and Unicode

Total characters: 149910
Distinct characters: 27
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 91
Unique (%): 0.9%

Sample

1st row: FUR-BO-10001798
2nd row: FUR-CH-10000454
3rd row: OFF-LA-10000240
4th row: FUR-TA-10000577
5th row: OFF-ST-10000760

Value | Count | Frequency (%)
off-pa-10001970 | 19 | 0.2%
tec-ac-10003832 | 18 | 0.2%
fur-fu-10004270 | 16 | 0.2%
tec-ac-10002049 | 15 | 0.2%
fur-ch-10001146 | 15 | 0.2%
fur-ch-10002647 | 15 | 0.2%
tec-ac-10003628 | 15 | 0.2%
off-bi-10001524 | 14 | 0.1%
fur-ch-10003774 | 14 | 0.1%
fur-fu-10001473 | 14 | 0.1%
Other values (1852) | 9839 | 98.4%

Most occurring characters

Value | Count | Frequency (%)
0 | 35052 | 23.4%
- | 19988 | 13.3%
F | 15347 | 10.2%
1 | 14995 | 10.0%
O | 6322 | 4.2%
2 | 4862 | 3.2%
4 | 4831 | 3.2%
3 | 4805 | 3.2%
A | 4422 | 2.9%
5 | 3401 | 2.3%
Other values (17) | 35885 | 23.9%

Most occurring categories, scripts and blocks

Value | Count | Frequency (%)
(unknown) | 149910 | 100.0%

The per-category, per-script and per-block character frequencies are identical to the most occurring characters.

Category
Categorical

High correlation  Missing 

Distinct: 3
Distinct (%): < 0.1%
Missing: 806
Missing (%): 7.5%
Memory size: 712.9 KiB

Value | Count
Office Supplies | 6026
Furniture | 2121
Technology | 1847

Length

Max length: 15
Median length: 15
Mean length: 12.802582
Min length: 9

Characters and Unicode

Total characters: 127949
Distinct characters: 20
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Furniture
2nd row: Furniture
3rd row: Office Supplies
4th row: Furniture
5th row: Office Supplies

Common Values

Value | Count | Frequency (%)
Office Supplies | 6026 | 55.8%
Furniture | 2121 | 19.6%
Technology | 1847 | 17.1%
(Missing) | 806 | 7.5%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values]

Value | Count | Frequency (%)
office | 6026 | 37.6%
supplies | 6026 | 37.6%
furniture | 2121 | 13.2%
technology | 1847 | 11.5%

Most occurring characters

Value | Count | Frequency (%)
e | 16020 | 12.5%
i | 14173 | 11.1%
p | 12052 | 9.4%
f | 12052 | 9.4%
u | 10268 | 8.0%
c | 7873 | 6.2%
l | 7873 | 6.2%
O | 6026 | 4.7%
s | 6026 | 4.7%
S | 6026 | 4.7%
Other values (10) | 29560 | 23.1%

Most occurring categories, scripts and blocks

Value | Count | Frequency (%)
(unknown) | 127949 | 100.0%

The per-category, per-script and per-block character frequencies are identical to the most occurring characters.

Sub-Category
Categorical

High correlation  Missing 

Distinct: 17
Distinct (%): 0.2%
Missing: 806
Missing (%): 7.5%
Memory size: 658.1 KiB

Value | Count
Binders | 1523
Paper | 1370
Furnishings | 957
Phones | 889
Storage | 846
Other values (12) | 4409

Length

Max length: 11
Median length: 9
Mean length: 7.191715
Min length: 3

Characters and Unicode

Total characters: 71874
Distinct characters: 28
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Bookcases
2nd row: Chairs
3rd row: Labels
4th row: Tables
5th row: Storage

Common Values

Value | Count | Frequency (%)
Binders | 1523 | 14.1%
Paper | 1370 | 12.7%
Furnishings | 957 | 8.9%
Phones | 889 | 8.2%
Storage | 846 | 7.8%
Art | 796 | 7.4%
Accessories | 775 | 7.2%
Chairs | 617 | 5.7%
Appliances | 466 | 4.3%
Labels | 364 | 3.4%
Other values (7) | 1391 | 12.9%
(Missing) | 806 | 7.5%

Length

[Figure: histogram of lengths of the category]

Value | Count | Frequency (%)
binders | 1523 | 15.2%
paper | 1370 | 13.7%
furnishings | 957 | 9.6%
phones | 889 | 8.9%
storage | 846 | 8.5%
art | 796 | 8.0%
accessories | 775 | 7.8%
chairs | 617 | 6.2%
appliances | 466 | 4.7%
labels | 364 | 3.6%
Other values (7) | 1391 | 13.9%

Most occurring characters

ValueCountFrequency (%)
s 9934
13.8%
e 8870
12.3%
r 7169
 
10.0%
i 5668
 
7.9%
n 5378
 
7.5%
a 4542
 
6.3%
o 3288
 
4.6%
p 3004
 
4.2%
h 2578
 
3.6%
c 2359
 
3.3%
Other values (18) 19084
26.6%

Most occurring categories

ValueCountFrequency (%)
(unknown) 71874
100.0%


Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 71874 | 100.0%


Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 71874 | 100.0%


Product Name
Text

Missing 

Distinct: 1850
Distinct (%): 18.5%
Missing: 806
Missing (%): 7.5%
Memory size: 964.7 KiB

Length

Max length: 127
Median length: 78
Mean length: 36.914449
Min length: 5

Characters and Unicode

Total characters: 368923
Distinct characters: 85
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 91
Unique (%): 0.9%

Sample

1st row: Bush Somerset Collection Bookcase
2nd row: Hon Deluxe Fabric Upholstered Stacking Chairs, Rounded Back
3rd row: Self-Adhesive Address Labels for Typewriters by Universal
4th row: Bretford CR4500 Series Slim Rectangular Table
5th row: Eldon Fold 'N Roll Cart System

Value | Count | Frequency (%)
xerox | 865 | 1.5%
x | 701 | 1.3%
(not shown) | 599 | 1.1%
with | 599 | 1.1%
avery | 557 | 1.0%
for | 539 | 1.0%
binders | 524 | 0.9%
chair | 479 | 0.9%
black | 426 | 0.8%
phone | 374 | 0.7%
Other values (2798) | 50371 | 89.9%

Most occurring characters

Value | Count | Frequency (%)
(space) | 45654 | 12.4%
e | 33538 | 9.1%
r | 20791 | 5.6%
o | 19902 | 5.4%
a | 19064 | 5.2%
i | 18648 | 5.1%
l | 16365 | 4.4%
n | 15622 | 4.2%
s | 14683 | 4.0%
t | 14550 | 3.9%
Other values (75) | 150106 | 40.7%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 368923 | 100.0%


Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 368923 | 100.0%


Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 368923 | 100.0%


Sales
Real number (ℝ)

High correlation  Missing 

Distinct: 5825
Distinct (%): 58.3%
Missing: 806
Missing (%): 7.5%
Infinite: 0
Infinite (%): 0.0%
Mean: 229.858
Minimum: 0.444
Maximum: 22638.48
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 84.5 KiB

Quantile statistics

Minimum: 0.444
5-th percentile: 4.98
Q1: 17.28
Median: 54.49
Q3: 209.94
95-th percentile: 956.98425
Maximum: 22638.48
Range: 22638.036
Interquartile range (IQR): 192.66

Descriptive statistics

Standard deviation: 623.2451
Coefficient of variation (CV): 2.7114353
Kurtosis: 305.31175
Mean: 229.858
Median Absolute Deviation (MAD): 45.406
Skewness: 12.972752
Sum: 2297200.9
Variance: 388434.46
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
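
Several of the Sales statistics above are derived from others: the reported figures are consistent with CV = standard deviation / mean (623.2451 / 229.858 ≈ 2.7114), IQR = Q3 − Q1 (209.94 − 17.28 = 192.66) and range = maximum − minimum (22638.48 − 0.444 = 22638.036). The sketch below shows these relations using only Python's `statistics` module. The helper name `describe` is hypothetical, and the conventions used (sample standard deviation with n − 1, linearly interpolated quartiles) are assumptions rather than something the report states.

```python
import statistics


def describe(values):
    """Derived summary statistics for a list of numbers (hypothetical helper)."""
    mean = statistics.mean(values)
    std = statistics.stdev(values)  # sample standard deviation (n - 1)
    median = statistics.median(values)
    # Quartiles with linear interpolation between data points
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    # Median absolute deviation: median distance from the median
    mad = statistics.median([abs(v - median) for v in values])
    return {
        "range": max(values) - min(values),
        "iqr": q3 - q1,
        "cv": std / mean,
        "mad": mad,
    }


stats = describe([1.0, 2.0, 3.0, 4.0, 5.0])
print(stats)
```

Because the coefficient of variation divides by the mean, it is only meaningful for variables whose mean is well away from zero.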
Value | Count | Frequency (%)
12.96 | 56 | 0.5%
19.44 | 39 | 0.4%
15.552 | 39 | 0.4%
25.92 | 36 | 0.3%
10.368 | 36 | 0.3%
32.4 | 28 | 0.3%
6.48 | 21 | 0.2%
17.94 | 21 | 0.2%
20.736 | 19 | 0.2%
14.94 | 17 | 0.2%
Other values (5815) | 9682 | 89.6%
(Missing) | 806 | 7.5%

Value | Count | Frequency (%)
0.444 | 1 | < 0.1%
0.556 | 1 | < 0.1%
0.836 | 1 | < 0.1%
0.852 | 1 | < 0.1%
0.876 | 1 | < 0.1%
0.898 | 1 | < 0.1%
0.984 | 1 | < 0.1%
0.99 | 1 | < 0.1%
1.044 | 1 | < 0.1%
1.08 | 3 | < 0.1%

Value | Count | Frequency (%)
22638.48 | 1 | < 0.1%
17499.95 | 1 | < 0.1%
13999.96 | 1 | < 0.1%
11199.968 | 1 | < 0.1%
10499.97 | 1 | < 0.1%
9892.74 | 1 | < 0.1%
9449.95 | 1 | < 0.1%
9099.93 | 1 | < 0.1%
8749.95 | 1 | < 0.1%
8399.976 | 1 | < 0.1%

Quantity
Real number (ℝ)

Missing 

Distinct: 14
Distinct (%): 0.1%
Missing: 806
Missing (%): 7.5%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.7895737
Minimum: 1
Maximum: 14
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 84.5 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 2
Median: 3
Q3: 5
95-th percentile: 8
Maximum: 14
Range: 13
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation: 2.2251097
Coefficient of variation (CV): 0.58716622
Kurtosis: 1.9918894
Mean: 3.7895737
Median Absolute Deviation (MAD): 1
Skewness: 1.2785448
Sum: 37873
Variance: 4.9511131
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=14)
Value | Count | Frequency (%)
3 | 2409 | 22.3%
2 | 2402 | 22.2%
5 | 1230 | 11.4%
4 | 1191 | 11.0%
1 | 899 | 8.3%
7 | 606 | 5.6%
6 | 572 | 5.3%
9 | 258 | 2.4%
8 | 257 | 2.4%
10 | 57 | 0.5%
Other values (4) | 113 | 1.0%
(Missing) | 806 | 7.5%

Value | Count | Frequency (%)
1 | 899 | 8.3%
2 | 2402 | 22.2%
3 | 2409 | 22.3%
4 | 1191 | 11.0%
5 | 1230 | 11.4%
6 | 572 | 5.3%
7 | 606 | 5.6%
8 | 257 | 2.4%
9 | 258 | 2.4%
10 | 57 | 0.5%

Value | Count | Frequency (%)
14 | 29 | 0.3%
13 | 27 | 0.2%
12 | 23 | 0.2%
11 | 34 | 0.3%
10 | 57 | 0.5%
9 | 258 | 2.4%
8 | 257 | 2.4%
7 | 606 | 5.6%
6 | 572 | 5.3%
5 | 1230 | 11.4%

Discount
Real number (ℝ)

High correlation  Missing  Zeros 

Distinct: 12
Distinct (%): 0.1%
Missing: 806
Missing (%): 7.5%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.15620272
Minimum: 0
Maximum: 0.8
Zeros: 4798
Zeros (%): 44.4%
Negative: 0
Negative (%): 0.0%
Memory size: 84.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0.2
Q3: 0.2
95-th percentile: 0.7
Maximum: 0.8
Range: 0.8
Interquartile range (IQR): 0.2

Descriptive statistics

Standard deviation: 0.20645197
Coefficient of variation (CV): 1.3216925
Kurtosis: 2.4095461
Mean: 0.15620272
Median Absolute Deviation (MAD): 0.2
Skewness: 1.6842947
Sum: 1561.09
Variance: 0.042622415
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=12)
Value | Count | Frequency (%)
0 | 4798 | 44.4%
0.2 | 3657 | 33.9%
0.7 | 418 | 3.9%
0.8 | 300 | 2.8%
0.3 | 227 | 2.1%
0.4 | 206 | 1.9%
0.6 | 138 | 1.3%
0.1 | 94 | 0.9%
0.5 | 66 | 0.6%
0.15 | 52 | 0.5%
Other values (2) | 38 | 0.4%
(Missing) | 806 | 7.5%

Value | Count | Frequency (%)
0 | 4798 | 44.4%
0.1 | 94 | 0.9%
0.15 | 52 | 0.5%
0.2 | 3657 | 33.9%
0.3 | 227 | 2.1%
0.32 | 27 | 0.2%
0.4 | 206 | 1.9%
0.45 | 11 | 0.1%
0.5 | 66 | 0.6%
0.6 | 138 | 1.3%

Value | Count | Frequency (%)
0.8 | 300 | 2.8%
0.7 | 418 | 3.9%
0.6 | 138 | 1.3%
0.5 | 66 | 0.6%
0.45 | 11 | 0.1%
0.4 | 206 | 1.9%
0.32 | 27 | 0.2%
0.3 | 227 | 2.1%
0.2 | 3657 | 33.9%
0.15 | 52 | 0.5%

Profit
Real number (ℝ)

High correlation  Missing 

Distinct: 7287
Distinct (%): 72.9%
Missing: 806
Missing (%): 7.5%
Infinite: 0
Infinite (%): 0.0%
Mean: 28.656896
Minimum: -6599.978
Maximum: 8399.976
Zeros: 65
Zeros (%): 0.6%
Negative: 1871
Negative (%): 17.3%
Memory size: 84.5 KiB

Quantile statistics

Minimum: -6599.978
5-th percentile: -53.03092
Q1: 1.72875
Median: 8.6665
Q3: 29.364
95-th percentile: 168.4704
Maximum: 8399.976
Range: 14999.954
Interquartile range (IQR): 27.63525

Descriptive statistics

Standard deviation: 234.26011
Coefficient of variation (CV): 8.1746504
Kurtosis: 397.18851
Mean: 28.656896
Median Absolute Deviation (MAD): 10.77855
Skewness: 7.5614316
Sum: 286397.02
Variance: 54877.798
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
0 | 65 | 0.6%
6.2208 | 43 | 0.4%
9.3312 | 38 | 0.4%
3.6288 | 32 | 0.3%
5.4432 | 32 | 0.3%
15.552 | 26 | 0.2%
12.4416 | 21 | 0.2%
7.2576 | 19 | 0.2%
3.1104 | 18 | 0.2%
9.072 | 11 | 0.1%
Other values (7277) | 9689 | 89.7%
(Missing) | 806 | 7.5%

Value | Count | Frequency (%)
-6599.978 | 1 | < 0.1%
-3839.9904 | 1 | < 0.1%
-3701.8928 | 1 | < 0.1%
-3399.98 | 1 | < 0.1%
-2929.4845 | 1 | < 0.1%
-2639.9912 | 1 | < 0.1%
-2287.782 | 1 | < 0.1%
-1862.3124 | 1 | < 0.1%
-1850.9464 | 1 | < 0.1%
-1811.0784 | 1 | < 0.1%

Value | Count | Frequency (%)
8399.976 | 1 | < 0.1%
6719.9808 | 1 | < 0.1%
5039.9856 | 1 | < 0.1%
4946.37 | 1 | < 0.1%
4630.4755 | 1 | < 0.1%
3919.9888 | 1 | < 0.1%
3177.475 | 1 | < 0.1%
2799.984 | 1 | < 0.1%
2591.9568 | 1 | < 0.1%
2504.2216 | 1 | < 0.1%

Interactions

[25 interaction plots; images not preserved in this text export]

Correlations

Variable | Category | Discount | Postal Code | Profit | Quantity | Region | Sales | Segment | Ship Mode | State | Sub-Category
Category | 1.000 | 0.377 | 0.000 | 0.056 | 0.000 | 0.000 | 0.072 | 0.000 | 0.000 | 0.019 | 0.999
Discount | 0.377 | 1.000 | 0.051 | -0.543 | -0.001 | 0.294 | -0.057 | 0.005 | 0.027 | 0.354 | 0.353
Postal Code | 0.000 | 0.051 | 1.000 | -0.004 | 0.014 | 0.921 | -0.001 | 0.035 | 0.038 | 0.968 | 0.000
Profit | 0.056 | -0.543 | -0.004 | 1.000 | 0.234 | 0.021 | 0.518 | 0.000 | 0.005 | 0.017 | 0.130
Quantity | 0.000 | -0.001 | 0.014 | 0.234 | 1.000 | 0.000 | 0.327 | 0.012 | 0.000 | 0.004 | 0.000
Region | 0.000 | 0.294 | 0.921 | 0.021 | 0.000 | 1.000 | 0.000 | 0.000 | 0.022 | 0.998 | 0.000
Sales | 0.072 | -0.057 | -0.001 | 0.518 | 0.327 | 0.000 | 1.000 | 0.002 | 0.000 | 0.000 | 0.142
Segment | 0.000 | 0.005 | 0.035 | 0.000 | 0.012 | 0.000 | 0.002 | 1.000 | 0.033 | 0.090 | 0.000
Ship Mode | 0.000 | 0.027 | 0.038 | 0.005 | 0.000 | 0.022 | 0.000 | 0.033 | 1.000 | 0.096 | 0.007
State | 0.019 | 0.354 | 0.968 | 0.017 | 0.004 | 0.998 | 0.000 | 0.090 | 0.096 | 1.000 | 0.000
Sub-Category | 0.999 | 0.353 | 0.000 | 0.130 | 0.000 | 0.000 | 0.142 | 0.000 | 0.007 | 0.000 | 1.000
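
The report does not say which coefficient each cell of this matrix uses. Assuming that pairs of categorical variables such as Category and Sub-Category are scored with a Cramér's V style measure (an assumption, not something the text confirms), a value near 1 means that one variable almost fully determines the other. The sketch below implements the plain, uncorrected Cramér's V in pure Python; the function name `cramers_v` and the toy labels are illustrative only.

```python
from collections import Counter
from math import sqrt


def cramers_v(xs, ys):
    """Uncorrected Cramér's V for two equal-length sequences of labels."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    row = Counter(xs)
    col = Counter(ys)
    # Pearson chi-squared statistic over the full contingency table
    chi2 = 0.0
    for x in row:
        for y in col:
            expected = row[x] * col[y] / n
            observed = joint.get((x, y), 0)
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(row), len(col)) - 1
    return sqrt(chi2 / (n * k)) if k > 0 else 0.0


# Toy data: every sub-category belongs to exactly one category.
category = ["Furniture", "Furniture", "Office Supplies",
            "Office Supplies", "Technology", "Technology"]
sub_category = ["Chairs", "Tables", "Paper", "Binders", "Phones", "Phones"]
v = cramers_v(sub_category, category)
print(round(v, 3))
```

Bias-corrected variants of the statistic exist and can yield values slightly below 1 even for a fully nested pair, which would be in line with the 0.999 shown above, though that explanation is a guess.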

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
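
One common way to compute nullity correlation is to turn each column into a 0/1 missingness indicator and correlate the indicators. The pandas sketch below follows that approach; the function name `nullity_correlation` and the toy frame are assumptions for illustration, and the report's own heatmap may be produced differently.

```python
import numpy as np
import pandas as pd


def nullity_correlation(df: pd.DataFrame) -> pd.DataFrame:
    """Pearson correlation between per-column missingness indicators."""
    indicators = df.isna().astype(int)
    # Columns that are never (or always) missing have no variance; drop them.
    varying = indicators.loc[:, indicators.nunique() > 1]
    return varying.corr()


df = pd.DataFrame({
    "Order ID": ["A", "B", "C", "D"],              # never missing
    "Sales": [10.0, np.nan, 5.0, np.nan],
    "Profit": [2.0, np.nan, 1.0, np.nan],          # missing with Sales
    "Postal Code": [42420.0, np.nan, np.nan, 90032.0],
})
corr = nullity_correlation(df)
print(corr)
```

In this dataset, most columns report the same 806 missing values, so their indicators would be expected to correlate strongly with one another.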

Sample

Index | Row ID | Order ID | Order Date | Ship Date | Ship Mode | Customer ID | Customer Name | Segment | Country | City | State | Postal Code | Region | Product ID | Category | Sub-Category | Product Name | Sales | Quantity | Discount | Profit
0 | 1 | CA-2017-152156 | 11/8/2017 | 11/11/2017 | Second Class | CG-12520 | Claire Gute | Consumer | United States | Henderson | Kentucky | 42420.0 | South | FUR-BO-10001798 | Furniture | Bookcases | Bush Somerset Collection Bookcase | 261.9600 | 2.0 | 0.00 | 41.9136
1 | 2 | CA-2017-152156 | 11/8/2017 | 11/11/2017 | Second Class | CG-12520 | Claire Gute | Consumer | United States | Henderson | Kentucky | 42420.0 | South | FUR-CH-10000454 | Furniture | Chairs | Hon Deluxe Fabric Upholstered Stacking Chairs, Rounded Back | 731.9400 | 3.0 | 0.00 | 219.5820
2 | 3 | CA-2017-138688 | 6/12/2017 | 6/16/2017 | Second Class | DV-13045 | Darrin Van Huff | Corporate | United States | Los Angeles | California | 90036.0 | West | OFF-LA-10000240 | Office Supplies | Labels | Self-Adhesive Address Labels for Typewriters by Universal | 14.6200 | 2.0 | 0.00 | 6.8714
3 | 4 | US-2016-108966 | 10/11/2016 | 10/18/2016 | Standard Class | SO-20335 | Sean O'Donnell | Consumer | United States | Fort Lauderdale | Florida | 33311.0 | South | FUR-TA-10000577 | Furniture | Tables | Bretford CR4500 Series Slim Rectangular Table | 957.5775 | 5.0 | 0.45 | -383.0310
4 | 5 | US-2016-108966 | 10/11/2016 | 10/18/2016 | Standard Class | SO-20335 | Sean O'Donnell | Consumer | United States | Fort Lauderdale | Florida | 33311.0 | South | OFF-ST-10000760 | Office Supplies | Storage | Eldon Fold 'N Roll Cart System | 22.3680 | 2.0 | 0.20 | 2.5164
5 | 6 | CA-2015-115812 | 6/9/2015 | 6/14/2015 | Standard Class | BH-11710 | Brosina Hoffman | Consumer | United States | Los Angeles | California | 90032.0 | West | FUR-FU-10001487 | Furniture | Furnishings | Eldon Expressions Wood and Plastic Desk Accessories, Cherry Wood | 48.8600 | 7.0 | 0.00 | 14.1694
6 | 7 | CA-2015-115812 | 6/9/2015 | 6/14/2015 | Standard Class | BH-11710 | Brosina Hoffman | Consumer | United States | Los Angeles | California | 90032.0 | West | OFF-AR-10002833 | Office Supplies | Art | Newell 322 | 7.2800 | 4.0 | 0.00 | 1.9656
7 | 8 | CA-2015-115812 | 6/9/2015 | 6/14/2015 | Standard Class | BH-11710 | Brosina Hoffman | Consumer | United States | Los Angeles | California | 90032.0 | West | TEC-PH-10002275 | Technology | Phones | Mitel 5320 IP Phone VoIP phone | 907.1520 | 6.0 | 0.20 | 90.7152
8 | 9 | CA-2015-115812 | 6/9/2015 | 6/14/2015 | Standard Class | BH-11710 | Brosina Hoffman | Consumer | United States | Los Angeles | California | 90032.0 | West | OFF-BI-10003910 | Office Supplies | Binders | DXL Angle-View Binders with Locking Rings by Samsill | 18.5040 | 3.0 | 0.20 | 5.7825
9 | 10 | CA-2015-115812 | 6/9/2015 | 6/14/2015 | Standard Class | BH-11710 | Brosina Hoffman | Consumer | United States | Los Angeles | California | 90032.0 | West | OFF-AP-10002892 | Office Supplies | Appliances | Belkin F5C206VTEL 6 Outlet Surge | 114.9000 | 5.0 | 0.00 | 34.4700

Index | Row ID | Order ID | Order Date | Ship Date | Ship Mode | Customer ID | Customer Name | Segment | Country | City | State | Postal Code | Region | Product ID | Category | Sub-Category | Product Name | Sales | Quantity | Discount | Profit
10790 | Yes | US-2018-147886 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN
10791 | Yes | US-2018-147886 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN
10792 | Yes | US-2018-147886 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN
10793 | Yes | US-2018-147886 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN
10794 | Yes | US-2018-147886 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN
10795 | Yes | US-2018-147886 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN
10796 | Yes | US-2018-147998 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN
10797 | Yes | US-2018-151127 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN
10798 | Yes | US-2018-155999 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN
10799 | Yes | US-2018-155999 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN

Duplicate rows

Most frequently occurring

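Index | Row ID | Order ID | Order Date | Ship Date | Ship Mode | Customer ID | Customer Name | Segment | Country | City | State | Postal Code | Region | Product ID | Category | Sub-Category | Product Name | Sales | Quantity | Discount | Profit | # duplicates
108 | Yes | CA-2018-100111 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 14
107 | Yes | CA-2017-165330 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 11
70 | Yes | CA-2016-164882 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 9
19 | Yes | CA-2015-142769 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 8
160 | Yes | CA-2018-161956 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 8
202 | Yes | US-2018-118087 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 8
6 | Yes | CA-2015-110786 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 7
28 | Yes | CA-2015-160766 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 7
79 | Yes | CA-2017-111682 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 7
96 | Yes | CA-2017-145261 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 7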
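
A table like the one above can be approximated by counting how often each fully identical row occurs. The pandas sketch below is one simple way to do that, assuming it is acceptable to compare rows by their text rendering so that NaN matches NaN; the function name `duplicate_counts` and the toy frame are hypothetical.

```python
import numpy as np
import pandas as pd


def duplicate_counts(df: pd.DataFrame) -> pd.Series:
    """Occurrence count of every row that appears more than once."""
    # Render each cell as text so that NaN compares equal to NaN,
    # then join the cells of a row into a single key.
    keys = df.astype(str).agg("|".join, axis=1)
    counts = keys.value_counts()
    return counts[counts > 1]


df = pd.DataFrame({
    "Row ID": ["Yes", "Yes", "Yes", "Yes", "Yes", "1"],
    "Order ID": ["CA-2018-100111"] * 3 + ["CA-2017-165330"] * 2 + ["CA-2017-152156"],
    "Sales": [np.nan, np.nan, np.nan, np.nan, np.nan, 261.96],
})
dupes = duplicate_counts(df)
print(dupes)

# Rows that repeat an earlier row
print(int(df.duplicated().sum()))
```

Every row in the table above has `Yes` as its Row ID and NaN in all columns after Order ID, so filtering on that pattern before deduplicating may be a useful first cleaning step.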